<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html><head><title>Programming in Lua : 21.2.2</title>
<link rel="stylesheet" type="text/css" href="pil.css">
</head>
<body>
<p class="warning">
<A HREF="contents.html"><IMG TITLE="Programming in Lua (first edition)" SRC="capa.jpg" ALT="" ALIGN="left"></A>This first edition was written for Lua 5.0. While still largely relevant for later versions, there are some differences.<BR>The third edition targets Lua 5.2 and is available at <A HREF="http://www.amazon.com/exec/obidos/ASIN/859037985X/theprogrammil3-20">Amazon</A> and other bookstores.<BR>By buying the book, you also help to <A HREF="../donations.html">support the Lua project</A>.
</p>
<table width="100%" class="nav">
<tr>
<td width="10%" align="left"><a href="21.2.1.html"><img border="0" src="left.png" alt="Previous"></a></td>
<td width="80%" align="center"><a href="contents.html"><font face="Helvetica,Arial,sanserif">
<font color="gray">Programming in </font><font color="blue"> Lua</font>
</font></a></td>
<td width="10%" align="right"><a href="21.3.html"><img border="0" src="right.png" alt="Next"></a></td>
</tr>
<tr>
<td width="10%" align="left"></td>
<td width="80%" align="center"><a href="contents.html#P3">Part III. The Standard Libraries</a>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
<a href="contents.html#21">Chapter 21. The I/O Library</a></td>
<td width="10%" align="right"></td></tr>
</table>
<hr/>
<p><h3>21.2.2 &ndash; Binary Files</h3>

<p>The simple model functions <code>io.input</code> and <code>io.output</code>
always open a file in text mode (the default).
In Unix,
there is no difference between binary files and text files.
But in some systems, notably Windows,
binary files must be opened with a special flag.
To handle such binary files,
you must use <code>io.open</code>,
with the letter `<code>b</code>&acute; in the mode string.

<p>Binary data in Lua are handled similarly to text.
A string in Lua may contain any bytes
and almost all functions in the libraries
can handle arbitrary bytes.
(You can even do pattern matching over binary data,
as long as the pattern does not contain a zero byte.
If you want to match the byte zero,
you can use the class <code>%z</code> instead.)

<p>Typically, you read binary data either with the <code>*all</code> pattern,
that reads the whole file, or with the pattern <em>n</em>,
that reads <em>n</em> bytes.
As a simple example,
the following program converts a text file from
DOS format to Unix format
(that is, it translates sequences of
carriage return-newlines to newlines).
It does not use the standard I/O files (<code>stdin</code>/<code>stdout</code>),
because those files are open in text mode.
Instead, it assumes that the names of the input file and the output
file are given as arguments to the program:

<pre>
    local inp = assert(io.open(arg[1], "rb"))
    local out = assert(io.open(arg[2], "wb"))
    
    local data = inp:read("*all")
    data = string.gsub(data, "\r\n", "\n")
    out:write(data)
    
    assert(out:close())
</pre>
You can call this program with the following command line:
<pre>
    > lua prog.lua file.dos file.unix
</pre>

<p>As another example, the following program prints all strings
found in a binary file.
The program assumes that a string is any zero-terminated
sequence of six or more valid characters,
where a valid character is any character accepted by the pattern
<code>validchars</code>.
In our example, that comprises the alphanumeric, the punctuation,
and the space characters.
We use concatenation and <code>string.rep</code> to create a
pattern that captures all sequences of six or more <code>validchars</code>.
The <code>%z</code> at the end of the pattern matches the byte zero at the
end of a string.
<pre>
    local f = assert(io.open(arg[1], "rb"))
    local data = f:read("*all")
    local validchars = "[%w%p%s]"
    local pattern = string.rep(validchars, 6) .. "+%z"
    for w in string.gfind(data, pattern) do
      print(w)
    end
</pre>

<p>As a last example,
the following program makes a dump of a binary file.
Again, the first program argument is the input file name;
the output goes to the standard output.
The program reads the file in chunks of 10 bytes.
For each chunk, it writes the hexadecimal representation of each byte,
and then it writes the chunk as text, changing control characters to dots.
<pre>
    local f = assert(io.open(arg[1], "rb"))
    local block = 10
    while true do
      local bytes = f:read(block)
      if not bytes then break end
      for b in string.gfind(bytes, ".") do
        io.write(string.format("%02X ", string.byte(b)))
      end
      io.write(string.rep("   ", block - string.len(bytes) + 1))
      io.write(string.gsub(bytes, "%c", "."), "\n")
    end
</pre>
Suppose we store that program in a file named <code>vip</code>;
if we apply the program to itself, with the call
<pre>
    prompt> lua vip vip
</pre>
it will produce an output like this (in a Unix machine):
<pre>
    6C 6F 63 61 6C 20 66 20 3D 20    local f = 
    61 73 73 65 72 74 28 69 6F 2E    assert(io.
    6F 70 65 6E 28 61 72 67 5B 31    open(arg[1
    5D 2C 20 22 72 62 22 29 29 0A    ], "rb")).
               ...
    22 25 63 22 2C 20 22 2E 22 29    "%c", ".")
    2C 20 22 5C 6E 22 29 0A 65 6E    , "\n").en
    64 0A                            d.
</pre>

<hr/>
<table width="100%" class="nav">
<tr valign="top">
<td width="90%" align="left">
<small>
  Copyright &copy; 2003&ndash;2004 Roberto Ierusalimschy.  All rights reserved.
</small>
</td>
<td width="10%" align="right"><a href="21.3.html"><img border="0" src="right.png" alt="Next"></a></td>
</tr>
</table>


</body></html>

